Preparations

Load the necessary libraries

library(car) # for regression diagnostics
library(broom) # for tidy output
library(ggfortify) # for model diagnostics
library(sjPlot) # for outputs
library(knitr) # for kable
library(effects) # for partial effects plots
library(emmeans) # for estimating marginal means
library(ggeffects) # for plotting marginal means
library(MASS) # for glm.nb
library(MuMIn) # for AICc
library(tidyverse) # for data wrangling
library(modelr) # for auxillary modelling functions
library(DHARMa) # for residual diagnostics plots
library(performance) # for residuals diagnostics
library(see) # for plotting residuals
library(patchwork) # grid of plots
library(scales) # for more scales

Scenario

An ecologist studying a rocky shore at Phillip Island, in southeastern Australia, was interested in how clumps of intertidal mussels are maintained (Quinn 1988). In particular, he wanted to know how densities of adult mussels affected recruitment of young individuals from the plankton. As with most marine invertebrates, recruitment is highly patchy in time, so he expected to find seasonal variation, and the interaction between season and density - whether effects of adult mussel density vary across seasons - was the aspect of most interest.

The data were collected from four seasons, and with two densities of adult mussels. The experiment consisted of clumps of adult mussels attached to the rocks. These clumps were then brought back to the laboratory, and the number of baby mussels recorded. There were 3-6 replicate clumps for each density and season combination.

Format of quinn.csv data files

SEASON DENSITY RECRUITS SQRTRECRUITS GROUP
Spring Low 15 3.87 SpringLow
.. .. .. .. ..
Spring High 11 3.32 SpringHigh
.. .. .. .. ..
Summer Low 21 4.58 SummerLow
.. .. .. .. ..
Summer High 34 5.83 SummerHigh
.. .. .. .. ..
Autumn Low 14 3.74 AutumnLow
.. .. .. .. ..
SEASON Categorical listing of Season in which mussel clumps were collected ­ independent variable
DENSITY Categorical listing of the density of mussels within mussel clump ­ independent variable
RECRUITS The number of mussel recruits ­ response variable
SQRTRECRUITS Square root transformation of RECRUITS - needed to meet the test assumptions
GROUPS Categorical listing of Season/Density combinations - used for checking ANOVA assumptions

Mussel

Read in the data

quinn <- read_csv("../public/data/quinn.csv", trim_ws = TRUE)
glimpse(quinn)
## Rows: 42
## Columns: 5
## $ SEASON       <chr> "Spring", "Spring", "Spring", "Spring", "Spring", "Spring…
## $ DENSITY      <chr> "Low", "Low", "Low", "Low", "Low", "High", "High", "High"…
## $ RECRUITS     <dbl> 15, 10, 13, 13, 5, 11, 10, 15, 10, 13, 1, 21, 31, 21, 18,…
## $ SQRTRECRUITS <dbl> 3.872983, 3.162278, 3.605551, 3.605551, 2.236068, 3.31662…
## $ GROUP        <chr> "SpringLow", "SpringLow", "SpringLow", "SpringLow", "Spri…
summary(quinn)
##     SEASON            DENSITY             RECRUITS      SQRTRECRUITS  
##  Length:42          Length:42          Min.   : 0.00   Min.   :0.000  
##  Class :character   Class :character   1st Qu.: 9.25   1st Qu.:3.041  
##  Mode  :character   Mode  :character   Median :13.50   Median :3.674  
##                                        Mean   :18.33   Mean   :3.871  
##                                        3rd Qu.:21.75   3rd Qu.:4.663  
##                                        Max.   :69.00   Max.   :8.307  
##     GROUP          
##  Length:42         
##  Class :character  
##  Mode  :character  
##                    
##                    
## 

Since we intend to model both SEASON and DENSITY as categorical variables, we need to explicitly declare them as factors.

quinn <- quinn %>% mutate(
  SEASON = factor(SEASON, levels = c("Spring", "Summer", "Autumn", "Winter")),
  DENSITY = factor(DENSITY)
)

Exploratory data analysis

Model formula: \[ \begin{align} y_i &\sim{} \mathcal{Pois}(\lambda_i)\\ ln(\mu_i) &= \boldsymbol{\beta} \bf{X_i}\\[1em] \end{align} \]

where \(\boldsymbol{\beta}\) is a vector of effects parameters and \(\bf{X}\) is a model matrix representing the intercept and effects of season, density and their interaction on mussel recruitment.

ggplot(quinn, aes(y = RECRUITS, x = SEASON, fill = DENSITY)) +
  geom_boxplot()

Conclusions:

  • there is clear evidence of non-homogeneity of variance
  • specifically, there is evidence that the variance is related to the mean in that boxplots that are lower on the y-axis (low mean) also have lower variance (shorter boxplots)
  • this might be expected for count data and we might consider that a Poisson distribution (which assumes that mean and variance are equal - and thus related in a very specific way).

Lets mimic the effect of using a log link, by using log scaled y-axis.

ggplot(quinn, aes(y = RECRUITS, x = SEASON, fill = DENSITY)) +
  geom_boxplot() +
  scale_y_log10()

Conclusions:

  • that is an improvement

Fit the model

Model validation

Different model

Partial plots

Model investigation / hypothesis testing

Predictions

Summary figures

As these summarise only involve categorical predictors, there is no need to define a prediction grid. For categorical predictors, the default grid will assume that you are interested in all the levels of the categorical predictors.

References

Quinn, G. P. 1988. “Ecology of the Intertidal Pulmonate Limpet Siphonaria Diemenensis Quoy Et Gaimard. II Reproductive Patterns and Energetics.” Journalofexperimentalmarinebiologyandecology 117: 137–56.